1 Introduction 



The quantum and classical worldviews differ profoundly in two ways, both
of which bear on the nature of observation. Briefly, if $\mathfrak{A}$ and $\mathfrak{B}$ are quantum
systems whose states are represented in Hilbert spaces $A$ and $B$ then the states
of the composite system $\mathfrak{A}\mathfrak{B}$ are represented in the tensor product $A \otimes B$. Only
if the state of $\mathfrak{A}\mathfrak{B}$ is a pure product state can $\mathfrak{A}$ and $\mathfrak{B}$ be truly regarded as
separate entities. If $\mathfrak{A}$ and $\mathfrak{B}$ interact then in general a pure product state will
evolve to a mixed state. This is in sharp contrast to the classical worldview,
according to which $\mathfrak{A}$ and $\mathfrak{B}$ always have definite states, whether they interact
or not.

This entanglement of interacting systems becomes particularly interesting
when the interaction has the nature of observation, and it is closely related to
the other great novelty of the quantum theory, viz., the apparent stochastic
nature of measurement in the quantum theory. We say 'apparent' because the
conclusion that quantum observation is nondeterministic follows from the
insistence that an observed system $\mathfrak{S}$ and an observer $\mathfrak{O}$ are in definite states
following an observation.

Suppose $\mathfrak{S}$ and $\mathfrak{O}$ are represented in Hilbert spaces $U$ and $W$. Suppose for
convenience that the measurement in question corresponds to a selfadjoint
operator $A$ on $U$ with a discrete spectrum $\{\lambda_i\}$. Let $U_i$ be the $\lambda_i$-eigenspace
of $A$, and let $P_i$ be the orthogonal projection (OP) to $U_i$. Then $\{U_i\}$ is an
orthogonal decomposition of $U$, and $\{P_i\}$ is an orthogonal decomposition of
the identity. In order that the interaction of $\mathfrak{S}$ and $\mathfrak{O}$ qualify as a measurement
of $A$, final states of $\mathfrak{O}$ corresponding to distinct eigenvalues of $A$ must be
distinguishable with certainty, i.e., they must be mutually orthogonal. Let
$W_i$ be the subspace of $W$ corresponding to the value $\lambda_i$. Suppose $\theta \in U$ and
$\psi \in W$ are unit vectors. A measurement interaction is defined as one in which
an initial state $\theta \otimes \psi$ evolves to a state

$$\sum_i P_i\theta \otimes \psi_i = \sum_i \theta_i \otimes \psi_i, \qquad (1)$$

where, for each $i$, $\psi_i$ is a unit vector in $W_i$, and we have defined $\theta_i = P_i\theta$ for
notational convenience.
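
As a concrete illustration of the measurement interaction (1), the following Python sketch builds the final state in a small finite-dimensional case and computes the squared norms of its branches. The specific choices here are assumptions made only for illustration and are not taken from the text: $U = W = \mathbb{C}^3$, the operator $A$ is taken to be diagonal in the standard basis (so that $P_i$ simply keeps the $i$-th coordinate), and the observer states $\psi_i$ are the standard basis vectors. The helper names `tensor` and `norm_sq` are likewise hypothetical.

```python
# Finite-dimensional sketch of the measurement interaction (1).
# Assumptions for illustration only: U = C^3 with A diagonal in the standard
# basis (so P_i keeps the i-th coordinate), and W = C^3 with the standard
# basis vectors as the mutually orthogonal observer states psi_i.

def tensor(u, w):
    """Coordinates of u (x) w in the product basis."""
    return [a * b for a in u for b in w]

def norm_sq(v):
    """Squared norm of a coordinate vector."""
    return sum(abs(a) ** 2 for a in v)

theta = [0.6, 0.0, 0.8j]                      # unit vector in U
psi = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]       # psi_i, a unit vector in W_i

# theta_i = P_i theta
theta_parts = [[a if k == i else 0 for k, a in enumerate(theta)]
               for i in range(3)]

# The branches theta_i (x) psi_i and their sum, the final state in (1).
branches = [tensor(theta_parts[i], psi[i]) for i in range(3)]
final_state = [sum(column) for column in zip(*branches)]

weights = [norm_sq(branch) for branch in branches]   # ||theta_i||^2
print(weights, norm_sq(final_state))
```

Because the branches are mutually orthogonal, the weights $\|\theta_i\|^2$ add up to the squared norm of the final state, which equals 1 up to rounding error.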

The Copenhagen interpretation states that if $\mathfrak{O}$ is an observer in some sense,
e.g., if $\mathfrak{O}$ is a person reading an experimental apparatus, then the final state
of $\mathfrak{S}\mathfrak{O}$ is not the sum (1) but is actually one of the pure product states $\theta_i \otimes \psi_i$,
and the probability that it is $\theta_i \otimes \psi_i$ is $\|\theta_i \otimes \psi_i\|^2 = \|\theta_i\|^2$. In this way the
Copenhagen interpretation accounts for our subjective experience that when
we perform a measurement as above we obtain a definite value; and it also
accounts for our empirical observation that if the measurement is repeated
many times, the frequency with which each value $\lambda_i$ is observed is close to
$\|\theta_i\|^2$. Note that the interpretation of $\|\theta_i\|^2$ as a probability is "written into"
the Copenhagen interpretation, which is to say, it is taken as an axiom of the
quantum theory. Following Farhi et al. [4], we refer to this as the probability
postulate.

The alternative to the Copenhagen interpretation is to take (1) at face value, as
the actual final state of $\mathfrak{S}\mathfrak{O}$. This is really no interpretation at all, being just
a restatement with emphasis of the basic principles of the quantum theory
of interacting systems, and hardly deserves a name; but its metaphysical
implications are felt by some to be so counterintuitive that it must be, if not
rejected outright, at least subjected to critical examination, for which purpose
it has been given the evocative name of the many-worlds interpretation.

The Copenhagen interpretation is useful in that it allows one to "get on"
with the business of physics insofar as this is understood to mean making
and confirming theoretical predictions of the results of experiments. As a
description of reality, however, it is defective, principally in that there is no
objective criterion to decide when an interaction of physical systems proceeds
according to the ordinary laws of quantum physics and when it undergoes the
so-called wave-packet collapse described above. A vivid version of this
conundrum is Schrödinger's cat, who is either dead or alive inside a box depending
on whether a radioactive nucleus has decayed, the question being whether the
animal is really either one or the other before some higher form of life (read:
the scientist performing this callous experiment) opens the box to observe the
result.

The manifest impossibility of defining which interactions are observations in
the Copenhagen sense (quite apart from the evident impossibility of saying
anything intelligible about the actual process of wave-packet collapse) would
appear to rule out the Copenhagen interpretation as a model of reality. On
the other hand, the principal (indeed, upon close analysis the only) objection
to the many-worlds "interpretation" seems to be: "I don't know about you,
but I'm certainly in a definite state". This sense that the many-worlds
interpretation of (1) is incompatible with experience appears ill founded on closer
examination, however, inasmuch as any question posed of $\mathfrak{S}\mathfrak{O}$ (in the form of
a selfadjoint operator to be measured) that has the answer 'yes' with certainty
for each $\theta_i \otimes \psi_i$ has the answer 'yes' with certainty for $\sum_i \theta_i \otimes \psi_i$, and vice
versa. (If we let 'yes' correspond to the eigenvalue 1, then such an operator is
the identity.) In other words, one cannot effectively ask the question: 'Is the
world in a pure product state or in a superposition of such states?'.

As far as the analysis of a single measurement is concerned, the many-worlds
interpretation thus has no disadvantage, and since it has the advantage of
internal consistency it would seem to be the interpretation of choice. Note,
however, that in the many-worlds view $\|\theta_i\|^2$ is merely the weight of $\theta_i \otimes \psi_i$
in $\sum_i \theta_i \otimes \psi_i$, rather than the probability of $\theta_i \otimes \psi_i$, as in the Copenhagen
interpretation. Indeed, the many-worlds interpretation of a single measurement
makes no statement at all regarding probability.

The interpretation of $\|\theta_i\|^2$ as a probability is of course based on the fact
that in a series of repetitions of this measurement the cumulative frequency
of outcome $\lambda_i$ approaches $\|\theta_i\|^2$. This raises the following intriguing issue. In
the many-worlds interpretation all possible outcomes of a measurement
operation are represented, and in the case of repeated measurements, all possible
sequences of outcomes are represented. Does this not imply the certainty that
whenever we carry out an experiment of this sort there are versions of us who
witness sequences of outcomes wildly at variance with theoretical predictions?
And are these versions not therefore led to reject the quantum theory solely
because of bad luck? The answer of course is 'yes', just as in the Copenhagen
interpretation the possibility is always present that relative frequencies
of outcomes in a series of measurements will deviate significantly from
theoretical prediction. Note that in the Copenhagen interpretation this is only a
possibility, while in the many-worlds interpretation it is a certainty. The
probability is small in the Copenhagen interpretation, just as the magnitude of
the corresponding component is small in the many-worlds interpretation. The
difference is that while we have learned to feel comfortable ignoring possible
outcomes that have small probability (indeed, our sanity depends on it), we
do not have a rationale for ignoring components with small magnitudes: we
cannot say that we are unlikely to have the corresponding experience, because
some version of us certainly does have that experience.
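
To make the size of the "deviant" component concrete, the following Python sketch computes, for a finite series of $n$ binary measurements, the total squared norm of all branches whose observed frequency differs from the expected one by more than a tolerance. It assumes, purely for illustration, that each measurement has two outcomes with weights $q$ and $1 - q$ and that the repetitions are independent, so that a branch containing $k$ outcomes of the first kind has squared norm $q^k (1-q)^{n-k}$; the function name `deviant_weight` and the parameter `eps` are hypothetical.

```python
from fractions import Fraction
from math import comb

def deviant_weight(n, q, eps):
    """Total squared norm of the branches whose frequency of 1s after n
    binary measurements differs from q by more than eps.

    A branch with k ones has squared norm q**k * (1 - q)**(n - k), and
    there are comb(n, k) such branches."""
    return sum(comb(n, k) * q**k * (1 - q)**(n - k)
               for k in range(n + 1)
               if abs(Fraction(k, n) - q) > eps)

q, eps = Fraction(1, 2), Fraction(1, 10)
for n in (10, 100, 1000):
    print(n, float(deviant_weight(n, q, eps)))
```

The printed weights shrink as $n$ grows but remain strictly positive for every finite $n$, which is exactly the situation described above: in the many-worlds reading these branches are present, however small their magnitude.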

Our instinct when confronted with what we suspect to be a statistical deviation 
is to prolong the series of measurements until the observed frequencies come 
back into line. In the Copenhagen interpretation, by the strong law of large 
numbers, this happens with probability 1. Precisely, in an infinite series of 
measurements, the probability that the frequency statistic will not tend to its 
proper limit is 0. It is natural to ask whether this scenario may be modeled in 
the many-worlds interpretation and whether the component corresponding to 
improper limiting behavior of the frequency statistic may be shown to vanish. 
In fact, this would appear to be the essential question, a positive answer to 
which nullifies any remaining objection to the many-worlds view. 

Everett [2] discussed the interpretation of $\|\theta_i\|^2$ as a probability in the
many-worlds interpretation, and Hartle [7] and Graham [5] applied the weak law of
large numbers to the frequency statistic in the context of finitely many
observations; but to actually derive the probability interpretation of the squared
norm from prior principles one must apply the strong law of large numbers to
infinite sequences of observables.

The simplest generalization of (1) to $N$ independent measurements involves
the tensor product of $N + 1$ statevector spaces, and the corresponding
generalization to infinitely many measurements involves the tensor product of
infinitely many Hilbert spaces. As shown by von Neumann [9], such a product
can be defined as a complex inner product space, but it is not separable, i.e.,
it has an uncountable set of pairwise orthogonal unit vectors. It thus lies outside
the standard framework of the quantum theory.



An analysis of the many-worlds interpretation for infinite sequences of
observations in terms of the von Neumann product was carried out by Farhi, Goldstone,
and Gutmann [4] (see also [6]). Unaware of the von Neumann construction, and
a fortiori unaware of the work of Farhi et al., the present author conceived and
derived this result in the context of ordinary (separable) Hilbert space. Here
(perceived) necessity was indeed the mother of invention, as the construction
employed in this proof was the inspiration and starting point for the
construction of the hidden-variables models presented in [8], in which, as in the
many-worlds setting, the apparent stochastic nature of observation is inherent
in the model, which is not itself probabilistic. [8] demonstrates a natural
correspondence between hidden-variables states and generic objects of the sort
that figure prominently in the foundations of mathematics, a correspondence
with intriguing ontological implications for hidden-variables states.



As explained above, the novelty of the present analysis compared to [4, 6] is
that it takes place entirely within the conventional framework of quantum
mechanics, in which physical states are represented by vectors in ordinary
Hilbert space, as opposed to a nonseparable inner-product space in which
there exists an uncountable set of pairwise orthogonal vectors. Cassinello and
Sanchez-Gomez and Caves and Schack [2] have criticized [3], claiming that
it does not succeed in its goal of deriving the probability postulate from prior
quantum principles. We disagree with these authors and present a critique of
their critique in a later section.



As a practical matter, it will be convenient to employ certain elementary
conventions of modern set theory throughout this discussion; these conventions
are summarized in the appendix along with the definitions and basic
properties of partial orders and boolean algebras. Theorems of a general nature are
labeled 'Proposition'; proofs of these are straightforward or readily available
in the existing literature, but some are nevertheless presented here for their
pedagogic value. Theorems specific to the argument of this paper are labeled
'Theorem'.



2 Propositional algebras



Suppose $V$ is a Hilbert space. We use the term proposition to refer to an
orthogonal projection or to its 1-eigenspace, i.e., its image, according to context.
Any closed subspace $P$ of $V$ is the image of a unique orthogonal projection,
the orthogonal projection to $P$. We say that propositions $P$ and $Q$ are
orthogonal to one another iff they are orthogonal as spaces, i.e., all vectors in $P$
are orthogonal to all vectors in $Q$. Regarding propositions as projections, this
is equivalent to $\operatorname{im} P \perp \operatorname{im} Q$, and this is equivalent to either of the
conditions: $\operatorname{im} Q \subseteq \ker P$ or $\operatorname{im} P \subseteq \ker Q$, and therefore to either of the conditions:
$PQ = 0$ or $QP = 0$ ('0' here denoting the zero operator).

We define the complement $\sim P$ of a proposition $P$ to be
$P^{\perp} \stackrel{\text{def}}{=} \{\phi \in V \mid (\forall \psi \in P)\ \phi \perp \psi\}$, regarding $P$ as a space, or equivalently, $1 - P$, regarding $P$ as
a projection. We say that propositions $P$ and $Q$ commute iff they
commute as projections. If $P$ and $Q$ commute we define the meet $P \wedge Q$ and join $P \vee Q$ by

$$P \wedge Q = P \cap Q \quad\text{and}\quad P \vee Q = \operatorname{span}(P \cup Q),$$

treating $P$ and $Q$ as spaces, where span is the linear span, i.e., closure under
vector addition and scalar multiplication. Note that commuting propositions
correspond to perpendicular spaces, spaces $P$ and $Q$ being perpendicular iff
there exist pairwise orthogonal spaces $A$, $B$, and $C$ such that

$$P \cap Q = A, \quad P = A \oplus B, \quad\text{and}\quad Q = A \oplus C.$$

$\oplus$ is direct sum, i.e., the linear span of the union of two spaces whose
intersection is $\{0\}$ ('0' here denoting the zero vector).

Suppose $\mathfrak{A}$ is a set of commuting propositions closed under the operations of
complementation and meet (equivalently, join). Then $\mathfrak{A}$ is a boolean algebra
(see the appendix), specifically a propositional algebra (PA).

Note that if $\mathcal{A}$ is a set of commuting propositions and $P, Q \in \mathcal{A}$, then $1 - P$
and $PQ$ commute with all members of $\mathcal{A}$. Thus any set $\mathcal{A}$ of commuting
propositions may be closed under complementation and meet to form the
boolean algebra generated by $\mathcal{A}$, which is the smallest propositional algebra
that includes $\mathcal{A}$.
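
The closure process just described can be tried out in a finite-dimensional setting. The following Python sketch is only an illustration under simplifying assumptions that are not in the text: the propositions are represented as diagonal 0/1 matrices on a 4-dimensional space (commuting projections can always be simultaneously diagonalized, so this loses no generality for finitely many of them), and the helper names `diag`, `complement`, `meet`, `join`, and `generated_algebra` are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as tuples of tuples."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def diag(*entries):
    """Diagonal matrix with the given 0/1 entries (a projection)."""
    n = len(entries)
    return tuple(tuple(entries[i] if i == j else 0 for j in range(n))
                 for i in range(n))

def complement(p):
    """The complement 1 - P."""
    n = len(p)
    return tuple(tuple((1 if i == j else 0) - p[i][j] for j in range(n))
                 for i in range(n))

def meet(p, q):
    """For commuting projections the meet is the product PQ."""
    return matmul(p, q)

def join(p, q):
    """Join via De Morgan: ~(~P ^ ~Q), which equals P + Q - PQ."""
    return complement(meet(complement(p), complement(q)))

def generated_algebra(generators):
    """Close a set of commuting projections under complement and meet."""
    algebra = set(generators)
    while True:
        new = {complement(p) for p in algebra}
        new |= {meet(p, q) for p in algebra for q in algebra}
        if new <= algebra:
            return algebra
        algebra |= new

P = diag(1, 1, 0, 0)
Q = diag(1, 0, 1, 0)
algebra = generated_algebra([P, Q])
print(len(algebra))
```

With these two generators the four atoms $PQ$, $P(1-Q)$, $(1-P)Q$, and $(1-P)(1-Q)$ are all nonzero, so the generated algebra is the full boolean algebra on four atoms.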

The meet $\bigwedge \mathcal{B}$ of a set $\mathcal{B}$ of commuting propositions is $\bigcap \{P \mid P \in \mathcal{B}\}$, which, as
the intersection of closed spaces, is closed and is therefore a proposition. Note
that if $\mathcal{A}$ is a set of commuting propositions and $\mathcal{B} \subseteq \mathcal{A}$, then $\bigwedge \mathcal{B}$ commutes
with all members of $\mathcal{A}$. Thus any set $\mathcal{A}$ of commuting propositions may
be closed under complementation and arbitrary meets (or joins) to form the
complete propositional algebra generated by $\mathcal{A}$, which is the smallest complete
PA that includes $\mathcal{A}$.



The paradigm of a boolean algebra is the collection $\mathcal{P}(A)$ of all subsets of
a given set $A$. In this algebra meet, join, and complement are respectively
intersection, union, and set-theoretic complement relative to $A$ ($\sim B = A \setminus B$).
A subset of $\mathcal{P}(A)$ is a boolean algebra iff it is closed under complementation
and intersection (and hence also union). We will be interested in countably
complete subset algebras, which are closed under countable intersections (and
countable unions).

For the present discussion we are interested in the case $A = {}^{\omega}2$, the set of
functions from $\omega$ to 2, where $\omega = \{0, 1, \ldots\}$ and $2 = \{0, 1\}$.$^{1}$ For $n \in \omega$
and $e \in 2$, let $J^n_e = \{f \in {}^{\omega}2 \mid f(n) = e\}$. The standard topology on ${}^{\omega}2$ is
defined by taking $\{J^n_e \mid n \in \omega, e \in 2\}$ as a subbase. That is, the open sets are
arbitrary unions of finite intersections of the sets $J^n_e$. The closed sets are those whose
complements are open. The class $\mathfrak{B}$ of Borel sets is the closure of the class of
open (or closed) sets under the operations of complementation and countable
intersection. $\mathfrak{B}$ is clearly a countably complete boolean algebra.

A map $h : \mathfrak{A} \to \mathfrak{C}$ from one boolean algebra into another is a homomorphism
iff it commutes with the complement operation and the binary meet operation
(or, equivalently, the binary join operation). $h$ is countably complete or
complete iff it also commutes with countable meets and joins or arbitrary meets
and joins, respectively.

Theorem 1 Suppose $P_n$, $n \in \omega$, are commuting propositions. There exists a
countably complete homomorphism $X \mapsto P(X)$ from the boolean algebra $\mathfrak{B}$ of
Borel subsets of ${}^{\omega}2$ to the complete PA generated by $\{P_n \mid n \in \omega\}$ such that
for each $n \in \omega$,

$$P(J^n_1) = P_n. \qquad (2)$$

PROOF. To extend (2) to a homomorphism on $\mathfrak{B}$ we could proceed by
repeatedly adjoining complements and countable intersections, defining
$P({}^{\omega}2 \setminus X) = \sim P(X)$ and $P(\bigcap_{n \in \omega} X_n) = \bigwedge_{n \in \omega} P(X_n)$. To effect this construction
we would have to show that at each intermediate stage, if we have families
$\{X_n \mid n \in \omega\}$ and $\{Y_n \mid n \in \omega\}$ such that $P(X_n)$ and $P(Y_n)$ are defined for all
$n \in \omega$, and $\bigcap_{n \in \omega} X_n = \bigcap_{n \in \omega} Y_n$, then $\bigwedge_{n \in \omega} P(X_n) = \bigwedge_{n \in \omega} P(Y_n)$. This can be
done directly, but it is easier to use the spectral theory of selfadjoint operators
on Hilbert space.$^{2}$

1 The use of '2' to denote $\{0, 1\}$ is standard in set theory when 0, 1, and 2 are
ordinals. For certain purposes it is convenient to identify the ordinals 0 and 1 with
the real numbers with the same names, but in expressions like '${}^{\omega}2$', '2' denotes the
set $\{0, 1\}$, not the real number 2.

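Definition 1 For $n \in \omega$ and $\sigma \in {}^{n}2$, i.e., for $\sigma$ a sequence of 0s and 1s of
length $n$, define $I_\sigma = \bigcap_{m<n} J^m_{\sigma(m)}$. Thus, $I_\sigma = \{f \in {}^{\omega}2 \mid \sigma \subseteq f\}$.$^{3}$

Let

$$P(I_\sigma) = \prod_{m<n} P(J^m_{\sigma(m)}). \qquad (3)$$

Define a map $\rho : {}^{\omega}2 \to [0, 3/2]$ by

$$\rho(f) = \sum_{n=0}^{\infty} f(n)\, 3^{-n}. \qquad (4)$$

Note that for $\sigma \in {}^{n}2$, $\rho^{\to} I_\sigma \subseteq C_\sigma = [x_\sigma, x_\sigma + 3^{-n}(3/2)]$, where $\rho^{\to} I_\sigma$ is the
image of $I_\sigma$ under $\rho$ and

$$x_\sigma = \sum_{m=0}^{n-1} \sigma(m)\, 3^{-m}.$$

$\rho$ is a homeomorphism of ${}^{\omega}2$ with the Cantor set formed from $[0, 3/2]$. (The
Cantor set of a closed interval is obtained by deleting the middle open third
of the interval, then deleting the middle open third of each of the two remaining
closed intervals, then deleting the middle open thirds of each of the four
remainders, ad infinitum, and taking the intersection of all of the closed sets
of this sequence.) Let $P_\lambda = \bigvee \{P(I_\sigma) \mid \sigma \in {}^{<\omega}2 \wedge \rho^{\to} I_\sigma \subseteq [0, \lambda]\}$. Let $A$ be the
selfadjoint operator defined by$^{4}$

$$A = \int \lambda \, dP_\lambda. \qquad (5)$$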



For any Borel function $F : [0, 3/2] \to \mathbb{R}$, the integral $\int F(\lambda)\, dP_\lambda$ is well defined,
and this is the standard definition of $F(A)$. In particular, suppose $X \subseteq {}^{\omega}2$ is
a Borel set. Let $F_X : [0, 3/2] \to \mathbb{R}$ be the characteristic function of $\rho^{\to} X$, i.e.,

$$F_X(\lambda) = \begin{cases} 1 & \text{if } \lambda \in \rho^{\to} X, \\ 0 & \text{if } \lambda \notin \rho^{\to} X. \end{cases}$$

Let

$$P(X) = F_X(A).$$

$P(X)$ is a selfadjoint operator with spectrum included in $\{0, 1\}$, so it is an
orthogonal projection. One easily shows that the map $X \mapsto P(X)$ is a
homomorphism of $\mathfrak{B}$ with a PA such that (2) holds. Clearly the image of this map
is the complete boolean algebra generated by $\{P_n \mid n \in \omega\}$. □

2 We are admittedly using an elephant gun to slay a gnat, but it gets the job done.

3 Remember that functions are sets of ordered pairs, so $\sigma \subseteq f$ iff the finite sequence
$\sigma$ is an initial segment of the infinite sequence $f$.

4 Eq. (5) means that for any $\psi \in V$, $A\psi$ is the limit of sums of the form
$\sum_{i=1}^{n} \lambda_i (P_{\lambda_i} - P_{\lambda_{i-1}})\psi$ with $\lambda_0 < \lambda_1 < \cdots < \lambda_n$, as $\lambda_0 \to -\infty$, $\lambda_n \to \infty$, and
$\max_{i=1,\ldots,n} (\lambda_i - \lambda_{i-1}) \to 0$, if the limit exists and is otherwise undefined. The spectral
theorem states that the selfadjoint operators are exactly the operators of this form
for monotone increasing OP-valued functions $\lambda \mapsto P_\lambda$ for which $\lim_{\lambda \to -\infty} P_\lambda = 0$
and $\lim_{\lambda \to \infty} P_\lambda = 1$. Such functions are called projection-valued measures.
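
The encoding $\rho$ of (4) used in the proof above can be explored with exact rational arithmetic. The following Python sketch is an illustration only: it works with finite 0/1 sequences (so it computes partial sums of (4), not $\rho$ itself), and the function names `rho_partial`, `cylinder_interval`, and `extensions_stay_inside` are hypothetical. It checks that every finite extension of a sequence $\sigma$ lands in the interval $C_\sigma = [x_\sigma, x_\sigma + 3^{-n}(3/2)]$.

```python
from fractions import Fraction
from itertools import product

def rho_partial(bits):
    """Partial sum of (4) over a finite 0/1 sequence: sum of bits[n] * 3**-n."""
    return sum(Fraction(b, 3**n) for n, b in enumerate(bits))

def cylinder_interval(sigma):
    """Endpoints of C_sigma = [x_sigma, x_sigma + 3**-n * (3/2)], n = len(sigma)."""
    x = rho_partial(sigma)
    return x, x + Fraction(3, 2) / 3**len(sigma)

def extensions_stay_inside(sigma, extra):
    """Check that all extensions of sigma by `extra` further bits map into C_sigma."""
    lo, hi = cylinder_interval(sigma)
    return all(lo <= rho_partial(sigma + tail) <= hi
               for tail in product((0, 1), repeat=extra))

print(cylinder_interval((0,)), cylinder_interval((1,)))
print(extensions_stay_inside((1, 0, 1), 6))
```

For sequences of length 1 the two intervals are $[0, 1/2]$ and $[1, 3/2]$, i.e., what remains of $[0, 3/2]$ after its open middle third is deleted, in line with the Cantor-set description in the proof.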



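From now on we suppose that $\langle P_n \mid n \in \omega \rangle$ is an $\omega$-sequence of commuting
propositions, $\mathfrak{P}$ is the complete PA it generates, and $X \mapsto P(X)$ is the
homomorphism of the Borel algebra $\mathfrak{B}$ to $\mathfrak{P}$ as given by Theorem 1.

Let $\psi \in V$ be an arbitrary nonzero vector. Define

$$\mu_\psi(X) = \|P(X)\psi\|^2. \qquad (6)$$

One readily verifies that $\mu_\psi$ is a measure on $\mathfrak{B}$, i.e.,

(1) $\mu_\psi(\emptyset) = 0$,

(2) $\mu_\psi(X) \ge 0$ for all $X \in \mathfrak{B}$, and

(3) if $X = \bigcup_{n \in \omega} X_n$ and $X_m \cap X_n = \emptyset$ for all $m \ne n$, then

$$\mu_\psi(X) = \sum_{n=0}^{\infty} \mu_\psi(X_n).$$

Note that $\mu_\psi({}^{\omega}2) = \|\psi\|^2$.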

The foregoing situation has the following physical interpretation. A
proposition corresponds to a measurement with two possible outcomes: 0 and 1. We
may paraphrase the measurement corresponding to $P$ as 'what is the value
of $P$?' or 'is $P$ 1?'. The projections $P_n$, $n \in \omega$, correspond to simultaneously
answerable questions about a physical state. For any Borel set $X \subseteq {}^{\omega}2$, $P(X)$
corresponds to the question 'is the sequence of values of $\langle P_n \mid n \in \omega \rangle$ an
element of $X$?'.



3 Probability



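Suppose $\psi$ is a normalized statevector and $0 < q < 1$. We say that $\psi$ is
$q$-homogeneous for $\langle P_n \mid n \in \omega \rangle$ iff for any $n \in \omega$ and any $\sigma \in {}^{n}2$,

$$\|P(I_{\sigma^\frown\langle 1 \rangle})\psi\|^2 = q\, \|P(I_\sigma)\psi\|^2. \qquad (7)$$

If $\psi$ is $q$-homogeneous then $\mu_\psi$ is the $q$-homogeneous measure $\mu_q$, which is the
measure on ${}^{\omega}2$ defined by the condition that

$$\mu_q(I_\sigma) = \prod_{i=0}^{n-1} q_{\sigma(i)}$$

for any $n \in \omega$ and $\sigma \in {}^{n}2$, where

$$q_0 = 1 - q, \qquad (8)$$
$$q_1 = q. \qquad (9)$$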
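
The defining condition of $\mu_q$ on the sets $I_\sigma$ is easy to evaluate directly. The following Python sketch (the function name `mu_q` is just a label chosen here, and exact fractions are used to avoid rounding) computes $\mu_q(I_\sigma)$ from (8) and (9) and checks two consequences: the weights of all $\sigma$ of a given length add up to 1, and appending a 1 to $\sigma$ multiplies the weight by $q$.

```python
from fractions import Fraction
from itertools import product

def mu_q(sigma, q):
    """mu_q(I_sigma): product of q_{sigma(i)}, with q_0 = 1 - q and q_1 = q."""
    weight = Fraction(1)
    for bit in sigma:
        weight *= q if bit == 1 else 1 - q
    return weight

q = Fraction(1, 3)
total = sum(mu_q(sigma, q) for sigma in product((0, 1), repeat=4))
print(total, mu_q((1, 0, 1), q))
```

Since the sets $I_\sigma$ for $\sigma$ of a fixed length partition ${}^{\omega}2$, the first check reflects the fact that $\mu_q$ is a unit measure.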

For example, $\psi$ could represent the Heisenberg state of an electron that is
subject to alternating measurements of its spin along two perpendicular axes, $P_n$
representing the $n$th observation, in which case $q = 1/2$. For another example,
if we imagine a system with infinite spatial extent, the $P_n$s could represent
judiciously chosen simultaneous measurements at different places. For a third
example, which does not involve infinite extension in time or space, let $\psi$ be
the state of a point particle uniformly distributed in a box $B$ that extends
along the $x$-axis from 0 to 1. For $n = 0, 1, \ldots$, and $i = 0, 1$, let $B^n_i$ be the
portion of $B$ consisting of points with $x$-coordinates whose binary expansion
contains $i$ in the $n$th place. Let $P_n$ consist of determining whether the particle
is in $B^n_1$. The $P_n$s commute (indeed, they are all functions of the position
operator for the $x$-coordinate) and $\psi$ is $\tfrac12$-homogeneous for this sequence. Each
of these examples is infinitary in its own way, the last in that it contemplates
measurements of arbitrarily high precision, but this is intrinsic to the notion of
a continuous observable and is standard in physics; indeed, the simultaneous
performance of all the $P_n$s in this case is simply the measurement of $x$. The
impossibility of actually performing infinitely many (or infinitely precise)
observations in any effective sense should not be taken to invalidate this line
of reasoning, any more than it invalidates applied probability theory, which in
the final analysis is founded on the notion of an infinite sequence of trials.
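
The third example can be checked by a short exact computation. In the following Python sketch, the set of $x \in [0, 1)$ whose binary expansion begins with a given finite sequence $\sigma$ is an interval, and for the uniformly distributed particle the weight of that set is taken to be its length. The indexing convention (place $n = 0$ is the first digit after the binary point) and the function names `digit_interval` and `weight` are assumptions made here for illustration.

```python
from fractions import Fraction

def digit_interval(sigma):
    """Interval [lo, hi) of x in [0, 1) whose binary expansion starts with sigma.

    Assumed convention: sigma[n] is the digit with place value 2**-(n + 1)."""
    lo = sum(Fraction(b, 2**(n + 1)) for n, b in enumerate(sigma))
    return lo, lo + Fraction(1, 2**len(sigma))

def weight(sigma):
    """Length of the interval, i.e. the weight of I_sigma for the uniform state."""
    lo, hi = digit_interval(sigma)
    return hi - lo

sigma = (1, 0, 1)
print(digit_interval(sigma), weight(sigma), weight(sigma + (1,)))
```

Appending a digit 1 always halves the weight, which is condition (7) with $q = 1/2$.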

Let $L_q \subseteq {}^{\omega}2$ be defined by the condition

$$f \in L_q \leftrightarrow \lim_{N \to \infty} F(N) = q, \qquad (10)$$

where

$$F(N) = \frac{1}{N} \sum_{n<N} f(n)$$

is the frequency statistic.

Proposition 2 (Strong law of large numbers)

$$\mu_q(L_q) = 1.$$

Theorem 3 If $\psi$ is $q$-homogeneous then $\mu_\psi(L_q) = 1$, whence, by (6) and the
normalization of $\psi$, $P(L_q)\psi = \psi$, i.e., $\psi$ is a 1-eigenvector of the observable $P(L_q)$, which
represents the question

'does the frequency $F(N)$ tend to $q$ as $N \to \infty$?'. (11)

Hence the answer to this question, posed of $\psi$, is 'yes'.
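
The step from $\mu_\psi(L_q) = 1$ to $P(L_q)\psi = \psi$ can be spelled out as follows. This is a sketch of the omitted reasoning as we read it, using only the definition (6), the normalization $\|\psi\| = 1$, and the fact that $P = P(L_q)$ is an orthogonal projection, so that $P\psi$ and $(1 - P)\psi$ are orthogonal.

```latex
\begin{align*}
1 = \|\psi\|^2
  &= \|P\psi\|^2 + \|(1 - P)\psi\|^2
     && \text{(Pythagoras, since } P\psi \perp (1 - P)\psi\text{)} \\
  &= \mu_\psi(L_q) + \|(1 - P)\psi\|^2
     && \text{(definition (6))} \\
  &= 1 + \|(1 - P)\psi\|^2 ,
\end{align*}
% hence (1 - P)\psi = 0, i.e. P(L_q)\psi = \psi.
```

Thus $\|(1 - P)\psi\| = 0$, which is the statement $P(L_q)\psi = \psi$.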
Note that $P(L_q)$ is in $\mathfrak{P}$ and commutes with all $P \in \mathfrak{P}$. Hence

$$P(L_q) P \psi = P\, P(L_q) \psi = P\psi \qquad (12)$$

for any $P \in \mathfrak{P}$. For $\psi \in V$, let

$$[\psi]^{\mathfrak{P}} = \overline{\operatorname{span}}\, \{P\psi \mid P \in \mathfrak{P}\}, \qquad (13)$$

i.e., the closure of the linear span of the orbit of $\psi$ under the action of $\mathfrak{P}$. $[\psi]^{\mathfrak{P}}$
is a mode$^{5}$ of the system in question and is the smallest mode that contains
$\psi$ and all the states to which it may be projected as a result of observations
in $\mathfrak{P}$. (12) gives

Theorem 4 $P(L_q)$ restricted to $[\psi]^{\mathfrak{P}}$ is the identity, so the answer to the
question (11) posed of any state in $[\psi]^{\mathfrak{P}}$ is 'yes'.

This corresponds to the fact that the first few values in an infinite sequence 
of trials (where 'few' may refer to any finite number) have no bearing on its 
behavior in the limit of infinitely many trials. 



5 A mode of a physical system is the set of states corresponding to a closed subspace
of its statevector space.



4 Tests of randomness



Note that there is nothing probabilistic in the statements of the preceding
section. We are not saying that (11) is usually true or even almost always true,$^{6}$
but rather that it is simply true. This is what it means to be an eigenstate
of an observable. Isn't this a bit strange? Surely there is the possibility that
$F(N) \not\to q$ even though the probability is 0? Not in the quantum worldview,
which is in this way more reasonable than the classical worldview. We will
have more to say on this subject a little later.

So where does probability come in? Probability is just a handy way of talking
about positive measures with maximum value 1. In the preceding example,
we may regard the sequences $f \in {}^{\omega}2$ as arising from the random process $\mathcal{P}_q$
that produces a sequence of 0s and 1s, with the probability of producing a 1
at any step being $q$. For any Borel $X \subseteq {}^{\omega}2$, $\mu_q(X)$ is the probability that a sequence
generated by $\mathcal{P}_q$ will be in $X$. $L_q$ may be regarded as a test of '$\mathcal{P} = \mathcal{P}_q$' in the
following sense. Suppose we are presented with a black box that produces a
sequence of 0s and 1s according to some process $\mathcal{P}$. If we used this device to
generate a sequence $f : \omega \to 2$ and found that $f \notin L_q$, we would be justified
in concluding that the device was not operating according to the process $\mathcal{P}_q$,
i.e., $\mathcal{P} \ne \mathcal{P}_q$. Any Borel $X \subseteq {}^{\omega}2$ with $\mu_q(X) = 1$ will similarly serve as a test
of the supposition that $\mathcal{P} = \mathcal{P}_q$, and $P(X)$ is definitely true of any state in
$[\psi]^{\mathfrak{P}}$. This was first observed by Gutmann [6] in the nonseparable situation.

Note that any countable collection of such tests may be applied simultaneously,
because the intersection of any countable collection of sets of measure 1 has
measure 1. Indeed we may imagine a grand test of '$\mathcal{P} = \mathcal{P}_q$' that is the
intersection of every imaginable test of this supposition, since there are only
countably many things we can imagine (in truth only finitely many, but let's
be generous). Let $G_q \subseteq {}^{\omega}2$ be this set.$^{7}$
Then $\mu_q(G_q) = 1$, so $P(G_q)\psi = \psi$, and we see that the question 'does the
sequence of outcomes of the observations $P_0, P_1, \ldots$ pass every imaginable
test of "$\mathcal{P} = \mathcal{P}_q$"?', posed of $\psi$, has the definite answer 'yes'. Thus we may
regard the process of measurement as stochastic.

The preceding example is only a special case of a general phenomenon. Suppose
$\mathbf{P} = \{P_n \mid n \in \omega\}$ is an arbitrary family of commuting propositions, $\mathfrak{P}$ is the
PA it generates, and $\psi$ is an arbitrary statevector. The probability measure
$\mu_\psi$ is in general not of the form $\mu_q$; rather, the probability that $P_n = 1$
conditioned on the values of $P_m$ for one or more $m \ne n$, or more generally
conditioned on the value of $Q$ for any $Q \in \mathfrak{P}$, is generally dependent on those
values. Nevertheless, as before, we define $G_\psi$ to be the intersection of all
$\mu_\psi$-measurable $T \subseteq {}^{\omega}2$ imaginable from $\mathbf{P}$ with $\mu_\psi(T) = 1$. Let $X = G_\psi$. Then
$X$ serves as the canonical test of $\mu_\psi$ for $\mathbf{P}$ as applied to $\psi$, and the question
'is the sequence of values of the $P_n$s in $X$?', asked of $\psi$, is definitely answered
in the affirmative.

The generalization of the foregoing analysis to arbitrary processes of
observation (not necessarily binary) in the quantum theory is straightforward. We
therefore conclude that the apparent stochastic nature of observation, with
the squared norm as a probability, is a consequence of the quantum theory
with observation treated like any other interaction, i.e., in the many-worlds
"interpretation".

Interestingly, this aspect of quantum physics sheds light on one of the singular
conceptual difficulties of probability theory, viz., that in a process like an
infinite sequence of Bernoulli trials, any specific outcome has probability 0,
yet some outcome does occur. Similarly, we may conceive of choosing a real
number according to some continuous probability distribution. Each trial of
this process yields a real number, say $x$, but the probability of this event, i.e.,
the measure of the set $\{x\}$, is 0. Of course, we have learned to live with this;
for example, we understand that it is not appropriate to define an "event" for
a given trial in terms of the result of that trial. Alternatively, we may stipulate
that we will only consider events that are definable in some sense; since there
are only countably many definitions, the union of all definable events with
probability 0 has probability 0, and we can be reasonably comfortable in the
assurance that we will never see a definable event with probability 0 actually
occur.

6 'Almost always' is used in measure theory to refer to sets whose complements
have measure 0, as, for example, the rational numbers have measure 0 as a subset
of the real numbers with the usual Lebesgue measure. In probability theory (which
is just measure theory when the total measure is 1) 'almost always' means 'with
probability 1', as, for example, a real number chosen at random according to a
continuous probability distribution is almost always irrational.

7 The use of imaginability as the criterion for inclusion in $G_q$ is admittedly
somewhat facetious. To be more objective we could replace it with definability in some
sense. The objectivity of definability is something of an illusion, which is why we
have emphasized 'in some sense': 'define' has to be defined. The potential difficulty
is illustrated by Richard's paradox, which asks 'what is the smallest natural number
not definable by an English phrase of fewer than sixteen words?'. Since there are
only finitely many English phrases of fewer than sixteen words, there must be such
a number, and there must therefore be a least such. But this number is definable
by the phrase

the smallest natural number not definable
by an English phrase of fewer than sixteen words,

which has fifteen words: a contradiction. We return to this paradox in [8, Section
6.1], where we give its resolution as an important limitation on the definability of
'define'.

In the quantum theory, however, the situation is a little different. Each "event"
is associated with a selfadjoint operator, and an event like those just mentioned
corresponds to the operator 0 and simply does not happen. Returning to the
scenario presented at the beginning of this article, recall that in answer to
the question whether, in a large but finite series of observations, the outcome
frequencies might not approximate their expected values, the Copenhagen
interpretation says 'yes, but it's unlikely', while the many-worlds interpretation
says 'yes, and it certainly happens for some version of us'. Let us now consider
the question whether, in an infinite series of observations, it is possible for the
outcome frequencies not to tend to their proper limits. To this the Copenhagen
interpretation says 'yes, with probability 0', while the many-worlds
interpretation says 'no, absolutely not'.



5 Absolute continuity of measures 

In Theorem 4 we noted that for a $q$-homogeneous $\psi$ the question $P(L_q)$ is
answered affirmatively not just for $\psi$ but for any $\psi' \in [\psi]^{\mathfrak{P}}$, the span of the
orbit of $\psi$ under the action of $\mathfrak{P}$. More generally, every $\psi' \in [\psi]^{\mathfrak{P}}$ passes every
test of $\mu_\psi$, so that the statistical properties represented by these "tests" are
really properties of the mode of the system represented by $[\psi]^{\mathfrak{P}}$, not of $\psi$ per
se. Of course, in general $\psi'$ may pass additional tests with certainty, i.e., $\mu_{\psi'}$
may have null sets that are not null for $\mu_\psi$. This section is devoted to an
examination of the relation $\psi' \in [\psi]^{\mathfrak{P}}$ in terms of the measures $\mu_\psi$ and $\mu_{\psi'}$. It
is not essential to the argument, but is of some interest in its own right.

Suppose $M$ is a nonempty set. A $\sigma$-algebra on $M$ is a countably complete
boolean set-algebra of subsets of $M$ (complementation being understood as
relative to $M$). A measure on a $\sigma$-algebra $\mathfrak{A}$ is a map $\mu : \mathfrak{A} \to [0, \infty]$ such that
$\mu(\emptyset) = 0$ and for any $A_0, A_1, \ldots \in \mathfrak{A}$, if $A_m \cap A_n = \emptyset$ for all $m \ne n$, then

$$\mu\Big(\bigcup_{n \in \omega} A_n\Big) = \sum_{n=0}^{\infty} \mu(A_n).$$

Note that we allow $\infty$ as a value of $\mu$, the relevant arithmetic being that
$a + \infty = \infty$ for all $a \in [0, \infty]$. $\mu(M)$ is obviously the largest value $\mu$ can have.

We say that $\mu$ is $\sigma$-finite iff $M$ is the union of countably many sets
of finite measure; $\mu$ is finite iff $\mu(M)$ is finite; $\mu$ is a unit measure
iff $\mu(M) = 1$. If $\mathfrak{A}$ is the algebra of Borel subsets of a
topological space $X$ (i.e., the closure of the class of open sets under
complementation and countable union), we call $\mu$ a Borel measure on $X$. We
are particularly interested in the case that $M$ is the space ${}^{\omega}2$ of
functions from $\omega$ to 2, i.e., $\omega$-sequences of zeros and ones, and
$\mathfrak{A}$ is the Borel algebra $\mathfrak{B}$ of this space; and we are most
particularly interested in the case of unit Borel measures.

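If $M$ and $N$ are sets with respective $\sigma$-algebras $\mathfrak{M}$ and
$\mathfrak{N}$ then a map $f : M \to N$ is $\mathfrak{M},\mathfrak{N}$-measurable
iff for any $X \in \mathfrak{N}$, $f^{\leftarrow}X \in \mathfrak{M}$. A function
$f : M \to {}^{\omega}2$ is $\mathfrak{M}$-measurable iff it is
$\mathfrak{M},\mathfrak{B}$-measurable.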

Suppose $\mu$ is a $\sigma$-finite measure on a $\sigma$-algebra $\mathfrak{M}$ on
a set $M$. We define the integral of a non-negative function as follows. For any
$f : M \to [0, \infty)$, we define the integral $\int f\,d\mu$ by a simple
modification of the method of Riemann sums familiar from calculus: instead of
partitioning the domain of $f$ we partition its range, i.e., $[0, \infty)$, and
define $\int f\,d\mu$ as the unique $x \in \mathbb{R}$ that lies between the
lower sums

\[ \sum_{n=0}^{\infty} x_n\, \mu\bigl(f^{\leftarrow}[x_n, x_{n+1})\bigr) \]

and the upper sums

\[ \sum_{n=0}^{\infty} x_{n+1}\, \mu\bigl(f^{\leftarrow}[x_n, x_{n+1})\bigr), \]

for all $0 = x_0 < x_1 < \cdots$ such that $\lim_{n\to\infty} x_n = \infty$. Note
that the integral so defined may in general be infinite, although this case will
not arise in our applications. As a rule, the integral with respect to a measure
has all the nice properties of integration appropriate to this general context.
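
The range-partition definition can be tried out on a finite measure space, where every sum is finite. The following Python sketch is only an illustration under stated assumptions: the three-point space, the weights `mu`, the function `f`, and the helper name `lower_upper_sums` are hypothetical choices made for the example and do not come from the text.

```python
from fractions import Fraction

def lower_upper_sums(mu, f, cuts):
    """Lower and upper sums of f with respect to a measure on a finite set.

    mu   : dict point -> weight (a measure on all subsets of a finite set)
    f    : dict point -> non-negative value
    cuts : increasing list x_0 = 0 < x_1 < ... that covers the range of f
    """
    lower = upper = Fraction(0)
    for lo, hi in zip(cuts, cuts[1:]):
        # measure of the preimage of the half-open interval [lo, hi)
        weight = sum(mu[p] for p in mu if lo <= f[p] < hi)
        lower += lo * weight
        upper += hi * weight
    return lower, upper

# Hypothetical example data: a unit measure on three points.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
f = {"a": Fraction(1), "b": Fraction(3), "c": Fraction(5, 2)}
exact = sum(f[p] * mu[p] for p in mu)  # the integral, computed directly

for k in (1, 4, 16):
    # partition [0, 4 + 1/k] of the range into steps of width 1/k
    cuts = [Fraction(i, k) for i in range(4 * k + 2)]
    lo, hi = lower_upper_sums(mu, f, cuts)
    assert lo <= exact <= hi
    print(k, lo, hi)
```

Refining the partition of the range squeezes the lower and upper sums toward the single value they bracket, which is the content of the definition.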

Definition 2 Suppose $\mu$ and $\nu$ are $\sigma$-finite measures on a
$\sigma$-algebra $\mathfrak{A}$ on a set $M$. $\nu$ is absolutely continuous with
respect to $\mu$ iff for all $A \in \mathfrak{A}$, if $\mu(A) = 0$ then
$\nu(A) = 0$; otherwise, $\nu$ is singular with respect to $\mu$.

Proposition 5 (Radon-Nikodym theorem) Suppose $\mu$ and $\nu$ are $\sigma$-finite
measures on a $\sigma$-algebra $\mathfrak{M}$ on a set $M$. $\nu$ is absolutely
continuous with respect to $\mu$ iff there exists a non-negative
$\mathfrak{M}$-measurable $F : M \to \mathbb{R}$ such that for all
$X \in \mathfrak{M}$,⁸

\[ \nu(X) = \int F(x)\,\chi_X(x)\,d\mu(x). \]
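
For measures given by point masses on a finite set, the density in the Radon-Nikodym theorem can be written down directly. The sketch below is a hedged illustration: the helper names `density` and `all_subsets` and the example measures are hypothetical, and the finite setting is an assumption that sidesteps all questions of measurability.

```python
from fractions import Fraction
from itertools import combinations

def density(mu, nu):
    """Return F with nu(X) = sum over X of F * mu, or None if nu is not
    absolutely continuous with respect to mu (finite point masses only)."""
    F = {}
    for p, m in mu.items():
        if m == 0:
            if nu[p] != 0:
                return None          # a mu-null set with positive nu-measure
            F[p] = Fraction(0)       # the value on a mu-null point is immaterial
        else:
            F[p] = nu[p] / m
    return F

def all_subsets(points):
    pts = sorted(points)
    return [set(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

# Hypothetical example measures on the set {0, 1, 2}.
mu = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}
nu = {0: Fraction(1, 4), 1: Fraction(3, 4), 2: Fraction(0)}
F = density(mu, nu)
for X in all_subsets(mu):
    # nu(X) equals the integral of F * chi_X with respect to mu
    assert sum(nu[p] for p in X) == sum(F[p] * mu[p] for p in X)

# A measure concentrated on a mu-null point has no density.
not_ac = {0: Fraction(0), 1: Fraction(0), 2: Fraction(1)}
print(density(mu, not_ac))
```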



8 The variable 'x' is inserted here to eliminate the ambiguity of expressions like
'$fg$', which might a priori refer to either $x \mapsto f(x)g(x)$,
$[x,y] \mapsto f(x)g(y)$, or $[x,y] \mapsto f(y)g(x)$.

Now suppose $P = \langle P_n \mid n \in \omega \rangle$ is a sequence of commuting
orthogonal projections in a Hilbert space $V$. Let $P(\cdot)$ be the map from the
set-algebra $\mathfrak{B}$ of Borel subsets of ${}^{\omega}2$ to the complete
projection algebra $\mathfrak{P}$ generated by
$\langle P_n \mid n \in \omega \rangle$ defined in Theorem 3. Note that if
$F : {}^{\omega}2 \to \mathbb{R}$ is Borel we can define the integral

\[ \int F(f)\,dP(f) \]

in the usual way for projection-valued measures, as is done in the spectral
theory of selfadjoint operators, except that $f$ ranges over ${}^{\omega}2$ in the
present case, rather than over $\mathbb{R}$. Note that

\[ \int \chi_X(f)\,dP(f) = P(X) \]

for any Borel $X \subseteq {}^{\omega}2$, where $\chi_X$ is the characteristic
function of $X$.
For each nonzero $\psi \in V$ let $\mu_\psi$ be the corresponding measure defined
above, so that for any Borel $X \subseteq {}^{\omega}2$

\[ \|P(X)\psi\|^2 = \mu_\psi(X)\,\|\psi\|^2. \]

Suppose $P \in \mathfrak{P}$ and let $\psi' = P\psi$. Then for any
$X \in \mathfrak{B}$, if $\mu_\psi(X) = 0$, then $P(X)\psi = 0$, so

\[ P(X)\psi' = P(X)P\psi = P\,P(X)\psi = 0, \]

so $\mu_{\psi'}(X) = 0$.



2 In other words, if $\psi' = P\psi$, then $\mu_{\psi'}$ is absolutely continuous
with respect to $\mu_\psi$.

In general, if $S$ is a set of operators on a Hilbert space $V$, a vector
$\psi \in V$ is said to be cyclic for $S$ in $V$ iff $V$ is the closure of the
linear span of the orbit of $\psi$ under the action of the operators in $S$. In
the present case, $\psi$ is cyclic for $\mathfrak{P}$ in $[\psi]^{P}$ but not in
any larger subspace of $V$.

Theorem 6 Suppose $\psi \in V$. For any $\psi' \in [\psi]^{P}$, $\mu_{\psi'}$ is
absolutely continuous with respect to $\mu_\psi$.

For any unit measure $\nu$ on $\mathfrak{B}$, if $\nu$ is absolutely continuous
with respect to $\mu_\psi$ then for some $\psi' \in [\psi]^{P}$,
$\nu = \mu_{\psi'}$.

PROOF. The first statement is a straightforward generalization of statement 2
above. To prove the second assertion suppose $X$ is a Borel subset of
${}^{\omega}2$. Let $\chi_X : {}^{\omega}2 \to \{0, 1\}$ be its characteristic
function. Then, as noted above,

\[ \int \chi_X\,dP = P(X), \tag{14} \]

whence it follows that for any $\psi \in V$

\[ \Bigl\| \int \chi_X\,dP\,\psi \Bigr\|^2 = \mu_\psi(X)\,\|\psi\|^2
   = \Bigl( \int \chi_X\,d\mu_\psi \Bigr) \|\psi\|^2 \tag{15} \]
\[ = \Bigl( \int (\chi_X)^2\,d\mu_\psi \Bigr) \|\psi\|^2. \tag{16} \]

The last transformation is a trivial one for characteristic functions, but it
gives the proper form to generalize to an arbitrary non-negative Borel function
$G : {}^{\omega}2 \to \mathbb{R}$:

\[ \Bigl\| \int G\,dP\,\psi \Bigr\|^2
   = \Bigl( \int G^2\,d\mu_\psi \Bigr) \|\psi\|^2, \tag{17} \]

if either side is well defined (in which case they both are). This is proved by
a straightforward application of limits of sums.

Now suppose $\nu$ is absolutely continuous with respect to $\mu_\psi$. Let
$F : {}^{\omega}2 \to \mathbb{R}$ be a non-negative Borel function such that for
all Borel $X \subseteq {}^{\omega}2$,⁹

\[ \nu(X) = \int F(f)\,\chi_X(f)\,d\mu_\psi(f), \tag{18} \]

as guaranteed by the Radon-Nikodym theorem. As we have assumed $\nu$ is a
unit measure,

\[ \int F\,d\mu_\psi = 1. \tag{19} \]

Note also that $\sqrt{F}$ is well defined and is non-negative and Borel. Let

\[ \psi' = \int \sqrt{F}\,dP\,\psi. \tag{20} \]

Applying (17) and (19) we find that

\[ \|\psi'\|^2 = \|\psi\|^2. \tag{21} \]



9 We use '$f$' as the variable of integration as a reminder that the domain of
integration is not $\mathbb{R}$ but rather ${}^{\omega}2$, which consists of
functions from $\omega$ to 2 (= $\{0, 1\}$). In this context, however, these are
best regarded as points, quite analogous to real numbers.

(Bounds relating to this identity may be used to show that the right side of
(20) is actually defined.) Recall that for any Borel $X \subseteq {}^{\omega}2$,

\[ P(X) = \int \chi_X\,dP. \]

The following computation is justified by the commutativity of all the
projections involved, and the usual manipulations of integrals in measure theory
(derivable by taking to the limit the corresponding summation identities).

\[ P(X)\psi' = \Bigl\{ \int \chi_X\,dP \Bigr\}
               \Bigl\{ \int \sqrt{F}\,dP \Bigr\} \psi \tag{22} \]
\[ = \Bigl\{ \int \sqrt{F(f)}\,\chi_X(f)\,dP(f) \Bigr\} \psi. \tag{23} \]

Taking the squared norms of both sides and applying (17), noting again that
$(\chi_X)^2 = \chi_X$ since $\chi_X$ is a characteristic function, we obtain

\[ \mu_{\psi'}(X)\,\|\psi'\|^2
   = \Bigl( \int F(f)\,\chi_X(f)\,d\mu_\psi(f) \Bigr) \|\psi\|^2. \]

We now apply (21) to conclude that

\[ \mu_{\psi'}(X) = \int F(f)\,\chi_X(f)\,d\mu_\psi(f) = \nu(X), \]

as desired. □
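
The construction in the proof can be checked numerically in a finite-dimensional stand-in for the setting of the theorem. The Python sketch below assumes only two commuting projections acting diagonally on a four-dimensional space, so that ${}^{\omega}2$ is replaced by the four pairs $(j_0, j_1)$ and every integral becomes a finite sum. The vector `psi`, the target measure `nu`, and the helper `mu` are hypothetical example choices.

```python
import math

def mu(psi):
    """mu_psi on single points s, from ||P({s}) psi||^2 = mu_psi({s}) ||psi||^2."""
    norm2 = sum(abs(a) ** 2 for a in psi.values())
    return {s: abs(a) ** 2 / norm2 for s, a in psi.items()}

# psi is written in a joint eigenbasis of P_0 and P_1; the key (j0, j1) records
# the two eigenvalues, so P(X) simply keeps the components whose key lies in X.
psi = {(0, 0): 0.5, (0, 1): 0.5j, (1, 0): -0.5, (1, 1): 0.5}
nu = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}  # target unit measure

mu_psi = mu(psi)
F = {s: nu[s] / mu_psi[s] for s in psi}                 # Radon-Nikodym density
psi_prime = {s: math.sqrt(F[s]) * psi[s] for s in psi}  # finite analogue of (20)
mu_prime = mu(psi_prime)

for s in psi:
    print(s, mu_prime[s], nu[s])
```

In this toy case the measure of `psi_prime` agrees with `nu` up to floating-point rounding, and `psi_prime` has the same norm as `psi`, mirroring (21).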



We continue to suppose that $P = \langle P_n \mid n \in \omega \rangle$ is a
sequence of commuting orthogonal projections in a Hilbert space $V$ and that
$P(\cdot)$ is the induced map from the algebra $\mathfrak{B}$ of Borel subsets of
${}^{\omega}2$ to orthogonal projections on $V$. Let $\ker P$ be the kernel or
nullspace of $P(\cdot)$, i.e., the set of Borel sets $X \subseteq {}^{\omega}2$
for which $P(X) = 0$. $\ker P$ is a countably complete ideal in $\mathfrak{B}$.

3 Clearly, for any Borel $X \subseteq {}^{\omega}2$, if $X \in \ker P$, then
$\mu_\psi(X) = 0$ for any $\psi \in V$; in other words, $\ker P$ is included in
the null ideal of $\mu_\psi$ for every nonzero $\psi \in V$. Conversely, if
$X \notin \ker P$, then $\mu_\psi(X) \neq 0$ for some $\psi \in V$, and $\psi$ may
be chosen (nonzero) so that $P(X)\psi = \psi$, so that $\mu_\psi(X) = 1$.

It is easy to construct examples for which $\ker P$ is exactly the null ideal of
$\mu_\psi$ for some $\psi \in V$: simply let $V = [\psi]^{P}$. Clearly these are
the only examples, i.e., the condition
$\forall P \in \mathfrak{P}\,(P \neq 0 \Rightarrow P\psi \neq 0)$ is equivalent
to the condition that $\psi$ be cyclic for $\mathfrak{P}$.
As far as we know, it is an open question whether this condition is necessarily
satisfied for some $\psi \in V$, but this has little bearing on the issues
addressed in this paper.



6 A critique critiqued 

As we noted in the introduction, a version of the preceding analysis was
presented by Farhi et al. [1] in the setting of the tensor product
$V^{(\infty)} = V_0 \otimes V_1 \otimes \cdots$ of infinitely many Hilbert spaces.
Their analysis has been criticized by Cassinello and Sanchez-Gomez and by Caves
and Schack [2] as not succeeding in its goal of deriving the probability
postulate from prior quantum principles. In this section we examine this
critique, referring specifically to the presentation given in [2]. For the
purpose of this commentary, in the interest of brevity, we assume familiarity
with [1] and [2].

Following [4], we restrict our attention to the case that $V$ is 2-dimensional. As
noted in the introduction, although $V^{(\infty)}$ is a complete inner product
space, it is not separable, and is therefore by definition not a Hilbert space.
Indeed, if we suppose for simplicity that $\|\psi\| = 1$, and we let
$\psi_0 = \psi$ and let $\psi_1$ be any normalized vector in $V$ orthogonal to
$\psi$, then the vectors of the form

\[ \psi_{j_0} \otimes \psi_{j_1} \otimes \cdots, \tag{24} \]

where $\langle j_0, j_1, \ldots \rangle$ is an infinite sequence of 0s and 1s,
constitute an orthonormal basis for $V^{(\infty)}$. There are as many such
sequences as there are real numbers; in particular, there are uncountably many.
Adapting the notations of [1] and [2] for ease of reference, we indicate the
basis vectors $\psi_0 = \psi$ and $\psi_1$ in $V_n$ by $|a, 0\rangle$ and
$|a, 1\rangle$, respectively, and (24) by $|a; \{j\}\rangle$, $a$ being an
arbitrary marker. (In [2] $\psi$ is used in place of $a$.)

$V^{(\infty)}$ is canonically the direct sum of separable subspaces, the
components of $V^{(\infty)}$, which are indeed Hilbert spaces. Let
$V_\psi^{(\infty)}$ be the component that contains
$|a; \{0, 0, \ldots\}\rangle = \psi \otimes \psi \otimes \cdots$. The vectors of
the form $|a; \{i\}\rangle$, where $\{i\}$ is a sequence of 0s and 1s that
contains only finitely many 1s, constitute an orthonormal basis for
$V_\psi^{(\infty)}$. $\{i\}$ will always be understood to be a sequence with only
finitely many 1s. Note that there are only countably many such sequences.

Let $B$ be the binary observable on $V$ in which we are interested, and let $B_n$
be $B$ acting on $V_n$ or on the $n$th factor of $V^{(\infty)}$:

\[ B_n(\psi_0 \otimes \psi_1 \otimes \cdots)
   = \psi_0 \otimes \cdots \otimes \psi_{n-1} \otimes B\psi_n
     \otimes \psi_{n+1} \otimes \cdots. \]

In the latter sense $B_n$ corresponds to $P_n$ as used in the preceding sections.
Let $|b, 0\rangle$ and $|b, 1\rangle$ be normalized 0- and 1-eigenvectors of $B$,
respectively. Let $p_0 = |\langle a, 0 | b, 0 \rangle|^2$ and
$p_1 = |\langle a, 0 | b, 1 \rangle|^2 = |\langle \psi | b, 1 \rangle|^2
= \|B\psi\|^2$ (so that $p_1$ is our $q$). For infinite sequences
$\{j\} = \langle j_0, j_1, \ldots \rangle$ of 0s and 1s, let $|b; \{j\}\rangle$
be a simultaneous eigenvector of all the $B_n$s with
$B_n |b; \{j\}\rangle = j_n |b; \{j\}\rangle$. Note that $|b; \{j\}\rangle$ is
only defined up to a nonzero scalar factor. As shown in [1], $|b; \{j\}\rangle$
is not normalizable.
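
As a concrete instance of these definitions, the weights $p_0$ and $p_1$ can be computed for a single two-dimensional factor. The Python sketch below is illustrative only: the particular vectors `psi`, `b0`, `b1` and the helper names `inner` and `B` are assumptions chosen for the example, with $B$ taken to be the projection onto $|b,1\rangle$.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

# Hypothetical example data for one 2-dimensional factor V.
b0, b1 = (1 + 0j, 0j), (0j, 1 + 0j)   # eigenvectors |b,0> and |b,1> of B
psi = (0.6 + 0j, 0.8j)                # |a,0> = psi, a unit vector

def B(v):
    """B = |b,1><b,1|, the projection with eigenvalues 0 and 1."""
    c = inner(b1, v)
    return tuple(c * x for x in b1)

p0 = abs(inner(psi, b0)) ** 2
p1 = abs(inner(psi, b1)) ** 2
norm_B_psi_sq = sum(abs(x) ** 2 for x in B(psi))

print(p0, p1, norm_B_psi_sq)
```

For a unit vector the two weights sum to 1, and `p1` coincides with the squared norm of `B(psi)`, as in the identity $p_1 = \|B\psi\|^2$.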

We express the $|a; \{i\}\rangle$s in terms of the $|b; \{j\}\rangle$s by

\[ \langle a; \{i\} \mid \Psi \rangle
   = \int d\mu(\{j\})\, \langle a; \{i\} \mid b; \{j\} \rangle
     \langle b; \{j\} \mid \Psi \rangle, \tag{25} \]

where $\Psi$ is arbitrary, and the integration is over all sequences $\{j\}$
according to a measure $\mu$, which essentially determines the normalization of
$|b; \{j\}\rangle$. This is (Til) on p. 374 of [1].

$|b; \{j\}\rangle$ has up to this point only been defined up to a scalar factor,
and in [1] this arbitrariness is eliminated (in the parenthesis following (TI))
by setting

\[ \langle a; \{0\} \mid b; \{j\} \rangle = 1. \tag{26} \]

(Here $\{0\}$ is the sequence $\langle 0, 0, \ldots \rangle$. In [1]
$|a; \{0\}\rangle$ is called $|\{\psi\}\rangle$.) With this convention we have
(TI) of [3]:

(27) 



and this of course uniquely determines $\mu$, which is shown in [4] to be the
$p_1$-homogeneous measure, which we would call $\mu_{p_1}$, i.e., the unit
measure that assigns independent weights $p_0, p_1$ to $j_n = 0, 1$.

Cassinello and Sanchez-Gomez and [2] take exception to this argument, noting that
the identification of $\mu_{p_1}$ as the measure in (25) depends on the choice
(26), which is not mandated by any consideration other than the desire to obtain
$\mu_{p_1}$. [2] considers the consequences of omitting this step and defines a
function $\{j\} \mapsto c_\psi(\{j\})$ by

\[ \langle a; \{0\} \mid b; \{j\} \rangle = c_\psi(\{j\}). \]

The measure $\mu$ of (25) is then

\[ d\mu(\{j\}) = |c_\psi(\{j\})|^2\, d\mu_{p_1}(\{j\}), \tag{28} \]

which, as pointed out there, is rather arbitrary. We note, however, that this
$\mu$ is absolutely continuous with respect to $\mu_{p_1}$, i.e., any set of
$\{j\}$s with $\mu_{p_1}$-measure 0 has $\mu$-measure 0. (We assume that
$|c_\psi(\{j\})|^2$ is a measurable function, which is necessary in order that
the discussion make sense.)

The error in these critiques lies in supposing that the argument of [1] depends
on $\mu = \mu_{p_1}$. In fact it does not. The crux of the argument in [1] comes
in the next section, where the frequency operator $F$ corresponding to $B$ is
defined by

\[ F\,|b; \{j\}\rangle = f(\{j\})\,|b; \{j\}\rangle, \]
where 




The idea here is that $f(\{j\})$ is defined for all $\{j\}$, and if
$\lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} j_n$ exists then $f(\{j\})$ has
that value.

By the strong law of large numbers, for a set of sequences $\{j\}$ with
$\mu_{p_1}$-measure 1, the limit does exist and is $p_1$. It follows that the
same is true for any measure absolutely continuous with respect to $\mu_{p_1}$,
in particular for the measure $\mu$ defined by (28), which appears in (25) if we
do not normalize according to (26). Hence

\[ F\,|\Psi\rangle = p_1\,|\Psi\rangle \]

for any $\Psi \in V_\psi^{(\infty)}$, in particular for
$|a; \{0\}\rangle = \psi \otimes \psi \otimes \cdots$, which is the state of
interest. In other words, a measurement of the proportion of $n$s for which the
value of $B_n$ is 1 certainly yields $p_1$.
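
The role of the strong law of large numbers here can be illustrated by sampling a long finite prefix of a sequence $\{j\}$ from the $p_1$-homogeneous measure and tracking the running frequency of 1s. This Python sketch is a numerical illustration, not a proof: the value of `p1`, the seed, the length `N`, and the helper name `running_frequency` are all assumptions made for the example, and a finite sample can only suggest the limiting behaviour.

```python
import random

def running_frequency(bits):
    """Yield (1/N) * (j_0 + ... + j_{N-1}) for N = 1, 2, ..."""
    total = 0
    for n, j in enumerate(bits, start=1):
        total += j
        yield total / n

rng = random.Random(0)   # fixed seed so the run is reproducible
p1 = 0.3                 # hypothetical weight of the outcome 1
N = 200_000              # length of the sampled prefix

# Independent bits, each equal to 1 with probability p1.
bits = [1 if rng.random() < p1 else 0 for _ in range(N)]
freqs = list(running_frequency(bits))

for n in (10, 1_000, 100_000, N):
    print(n, freqs[n - 1])
```

A measure that is absolutely continuous with respect to the $p_1$-homogeneous measure cannot give positive weight to the null set of sequences on which this convergence fails, which is the step used in the argument.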

The same argument applies generally, and all states in $V_\psi^{(\infty)}$
certainly pass all tests of randomness. Thus the probability postulate does
indeed follow from prior quantum principles in this setting, just as it does in
the conventional setting as shown in earlier sections.

Before moving on, we remark that our discussion at the end of Section 4 is
pertinent to Section 4.1 of [2], entitled "Certainty versus probability 1". The
quantum theoretical representation of observables as operators on a (separable)
Hilbert space actually does make "certainty" equivalent to "probability 1".
Essentially, quantum observables in the usual sense do not possess sufficient
discriminatory power to distinguish these notions.

Appendix

$\emptyset$ is the empty set, i.e., the (unique) set with no members. We also use
'0' to denote $\emptyset$. 1 is defined as $\{0\}$, 2 is $\{0, 1\}$, 3 is
$\{0, 1, 2\}$, etc. In other words, the natural numbers are defined in such a way
that each natural number $n$ is the set of numbers that precede it (of which
there are $n$). These are also referred to as the finite ordinals.
$\omega \stackrel{\mathrm{def}}{=} \{0, 1, \ldots\}$ is the set of all finite
ordinals.

We regard a function as a set of ordered pairs, so that
$f = \{[x, f(x)] \mid x \in \operatorname{dom} f\}$. $\operatorname{dom} f$, the
domain of $f$, is the set of first elements of pairs in $f$ and
$\operatorname{im} f$, the image of $f$, is the set of second elements of pairs
in $f$. We also represent $f$ by
$\langle f(x) \mid x \in \operatorname{dom} f \rangle$. For $n \in \omega$, a
sequence of length $n$ or $n$-sequence or $n$-tuple is a function with domain
$n$. In the notation just introduced, if $\sigma$ is an $n$-sequence then
$\sigma = \langle \sigma(m) \mid m \in n \rangle$. We may also denote such a
sequence $\sigma$ by indicating its elements as an explicit list between angle
brackets. Thus
$\sigma = \langle \sigma(0), \sigma(1), \ldots, \sigma(n-1) \rangle$. Similarly,
an $\omega$-sequence, also called an infinite sequence, is a function with domain
$\omega$, and we may indicate such a sequence $\sigma$ by
'$\langle \sigma(0), \sigma(1), \ldots \rangle$'. Concatenation of finite
sequences is defined and indicated as follows:
$\sigma^\frown\sigma' = \langle \sigma(0), \ldots, \sigma(n-1), \sigma'(0),
\ldots, \sigma'(n'-1) \rangle$, where $\sigma$ and $\sigma'$ have length $n$ and
$n'$, respectively.

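If $f$ is a function and $X$ is a set,

\[ \begin{aligned}
f{\restriction}X &\stackrel{\mathrm{def}}{=} \{[x, f(x)] \mid x \in X \cap \operatorname{dom} f\} \\
f^{\rightarrow}X &\stackrel{\mathrm{def}}{=} \{y \mid (\exists x \in X \cap \operatorname{dom} f)\, f(x) = y\} \\
f^{\leftarrow}X &\stackrel{\mathrm{def}}{=} \{y \in \operatorname{dom} f \mid f(y) \in X\}.
\end{aligned} \]

If $A$ and $B$ are sets we define ${}^{A}B$ (read "$B$ pre $A$") to be the set of
all functions from $A$ to $B$. In particular, for $n \in \omega$, ${}^{n}B$ is the
set of functions from $n$ into $B$, i.e., the set of $n$-tuples from $B$, and
${}^{\omega}B$ is the set of all infinite sequences from $B$. We define
${}^{<\omega}B$ to be $\bigcup_{n \in \omega} {}^{n}B$.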
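
For finite sets these conventions translate directly into code. The following Python sketch models a function as a `dict` of argument-value pairs; the helper names `restrict`, `image`, `preimage`, and `functions` are hypothetical and chosen only to mirror the three operations and the set ${}^{A}B$ defined above.

```python
from itertools import product

def restrict(f, X):
    """The restriction of f to X: the pairs [x, f(x)] with x in X and in dom f."""
    return {x: y for x, y in f.items() if x in X}

def image(f, X):
    """The set of values f(x) for x in X and in dom f."""
    return {y for x, y in f.items() if x in X}

def preimage(f, X):
    """The set of y in dom f with f(y) in X."""
    return {x for x, y in f.items() if y in X}

def functions(A, B):
    """All functions from A to B (both finite), each represented as a dict."""
    A, B = sorted(A), sorted(B)
    return [dict(zip(A, values)) for values in product(B, repeat=len(A))]

two = {0, 1}        # the ordinal 2
three = {0, 1, 2}   # the ordinal 3
sigma = {0: 1, 1: 0, 2: 1}   # the 3-sequence <1, 0, 1>

print(len(functions(three, two)))   # number of 3-tuples of zeros and ones
print(preimage(sigma, {1}))
print(restrict(sigma, {0, 5}))
```

Since $n$ is itself the set $\{0, \ldots, n-1\}$, an $n$-tuple is literally a function with domain $n$, which is why `functions(three, two)` enumerates the binary 3-sequences.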

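Definition 3 A boolean algebra is a structure
$\mathfrak{A} = (|\mathfrak{A}|, \sim, \vee, \wedge, 1, 0)$ with the following
properties:

(1) $0 \neq 1$.

(2) For all $P, Q, R \in |\mathfrak{A}|$

\[ \begin{aligned}
P \vee P &= P, & P \wedge P &= P, \\
P \vee Q &= Q \vee P, & P \wedge Q &= Q \wedge P, \\
P \vee (Q \vee R) &= (P \vee Q) \vee R, & P \wedge (Q \wedge R) &= (P \wedge Q) \wedge R, \\
(P \vee Q) \wedge R &= (P \wedge R) \vee (Q \wedge R), & (P \wedge Q) \vee R &= (P \vee R) \wedge (Q \vee R).
\end{aligned} \]

(3) For each $P \in |\mathfrak{A}|$, $\sim P$ is the unique element of
$|\mathfrak{A}|$ such that

\[ P \vee \sim P = 1 \quad\text{and}\quad P \wedge \sim P = 0. \]

(4) For all $P, Q \in |\mathfrak{A}|$

\[ \sim(P \vee Q) = \sim P \wedge \sim Q \quad\text{and}\quad
   \sim(P \wedge Q) = \sim P \vee \sim Q. \]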

We write '$P - Q$' for '$P \wedge (\sim Q)$'.
The order relation for $\mathfrak{A}$ is defined by

\[ P \leq Q \stackrel{\mathrm{def}}{\iff} P = P \wedge Q, \]

or, equivalently, $P \leq Q \iff Q = P \vee Q$. $P \wedge Q$ is therefore the
greatest lower bound and $P \vee Q$ the least upper bound of $\{P, Q\}$.
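
The algebra of all subsets of a finite set is the simplest example of a boolean algebra, and the identities above can be checked on it by brute force. In this Python sketch the underlying set `U`, the list `A` of its subsets, and the helpers `neg` and `leq` are hypothetical names chosen for the example; join, meet, and complement are modelled by union, intersection, and set difference from `U`.

```python
from itertools import combinations

U = frozenset({0, 1, 2})   # plays the role of the element 1; frozenset() is 0
A = [frozenset(c) for r in range(len(U) + 1) for c in combinations(sorted(U), r)]

def neg(P):
    """Complement relative to U."""
    return U - P

def leq(P, Q):
    """The order relation: P <= Q iff P equals the meet of P and Q."""
    return P == (P & Q)

distributive = all(
    ((P | Q) & R) == ((P & R) | (Q & R)) and ((P & Q) | R) == ((P | R) & (Q | R))
    for P in A for Q in A for R in A
)
de_morgan = all(
    neg(P | Q) == (neg(P) & neg(Q)) and neg(P & Q) == (neg(P) | neg(Q))
    for P in A for Q in A
)
# Both forms of the order relation agree with set inclusion.
order_agrees = all(
    leq(P, Q) == (Q == (P | Q)) and leq(P, Q) == (P <= Q)
    for P in A for Q in A
)

print(distributive, de_morgan, order_agrees)
```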

A boolean algebra $\mathfrak{A}$ is complete iff every subset of $|\mathfrak{A}|$
has a greatest lower bound, which we call the meet of the set. Every set in a
complete boolean algebra also has a least upper bound, which we call the join of
the set. $\mathfrak{A}$ is countably complete iff every countable set of elements
has a meet (equivalently, a join) in $\mathfrak{A}$.

Elements of a boolean algebra $\mathfrak{A}$ are compatible iff their meet is
nonzero. $\mathfrak{A}$ has the countable chain condition iff every set of
pairwise incompatible elements is countable.

Proposition 7 If a boolean algebra $\mathfrak{A}$ is countably complete and
satisfies the countable chain condition then $\mathfrak{A}$ is complete.



PROOF. The proof depends on the axiom of choice. On this assumption any
set can be put in one-one correspondence with a von Neumann cardinal, i.e.,
an ordinal $\kappa$ such that there does not exist a function from any ordinal
$\lambda < \kappa$ onto $\kappa$. We proceed by induction. Suppose joins exist
for all sets of cardinality less than $\kappa$, where $\kappa$ is a cardinal, and
suppose $X = \{x_\alpha \mid \alpha \in \kappa\}$ is a subset of $\mathfrak{A}$.
Let $y_\alpha = \bigvee \{x_\beta \mid \beta \in \alpha\}$ for each
$\alpha \in \kappa$. The elements $x_\alpha - y_\alpha$, $\alpha \in \kappa$,
are pairwise incompatible, so by the countable chain condition, only countably
many of them are nonzero, so the join of the corresponding $x_\alpha$s exists.
This is clearly the least upper bound of $X$. □